Developing a Workflow to Identify Inconsistencies in Volunteered Geographic Information: A Phenological Case Study
Authors
Abstract
Recent improvements in online information communication and mobile location-aware technologies have led to the production of large volumes of volunteered geographic information. Widespread, large-scale efforts by volunteers to collect data can inform and drive scientific advances in diverse fields, including ecology and climatology. Traditional workflows to check the quality of such volunteered information can be costly and time-consuming because they rely heavily on human intervention. However, identifying factors that can influence data quality, such as inconsistency, is crucial when these data are used in modeling and decision-making frameworks. Recently developed workflows use simple statistical approaches that assume that the majority of the information is consistent. This assumption is not generalizable, and it ignores underlying geographic and environmental contextual variability that may explain apparent inconsistencies. Here we describe an automated workflow to check inconsistency based on the availability of contextual environmental information for sampling locations. The workflow consists of three steps: (1) dimensionality reduction to facilitate further analysis and interpretation of results, (2) model-based clustering to group observations according to their contextual conditions, and (3) identification of inconsistent observations within each cluster. The workflow was applied to volunteered observations of flowering in common and cloned lilac plants (Syringa vulgaris and Syringa x chinensis) in the United States for the period 1980 to 2013. About 97% of the observations for both common and cloned lilacs were flagged as consistent, indicating that volunteers provided reliable information for this case study. Relative to the original dataset, the exclusion of inconsistent observations changed the apparent rate of change in lilac bloom dates by two days per decade, indicating the importance of inconsistency checking as a key step in data quality assessment for volunteered geographic information. Initiatives that leverage volunteered geographic information can adapt this workflow to improve the quality of their datasets and the robustness of their scientific analyses.
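To make step (3) and the trend comparison concrete, here is a minimal sketch in Python. It is not the authors' implementation. The abstract does not state which rule flags an observation within a cluster, so the median/MAD robust z-score and its 3.5 threshold are assumptions chosen for illustration. The cluster labels are assumed to come from steps (1) and (2), which are not reproduced here, and the records are invented toy values, not lilac data. The function names `flag_inconsistent` and `trend_days_per_decade` are hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical toy records: (cluster_label, year, bloom_day_of_year).
# Cluster labels are assumed to come from dimensionality reduction
# followed by model-based clustering; the values are invented.
records = [
    (0, 1980, 130), (0, 1981, 129), (0, 1982, 131), (0, 1983, 128),
    (0, 1984, 130), (0, 1985, 129), (0, 1986, 190),
    (1, 1980, 100), (1, 1981, 101), (1, 1982, 99), (1, 1983, 100),
    (1, 1984, 102), (1, 1985, 98), (1, 1986, 100),
]


def flag_inconsistent(rows, threshold=3.5):
    """Flag observations far from their own cluster's median bloom day.

    Assumed rule (not from the source): a robust z-score based on the
    median absolute deviation (MAD), flagged when it exceeds `threshold`.
    Returns one boolean per row, True meaning "flagged as inconsistent".
    """
    members = defaultdict(list)
    for idx, (cluster, _year, _day) in enumerate(rows):
        members[cluster].append(idx)

    flags = [False] * len(rows)
    for idxs in members.values():
        days = [rows[i][2] for i in idxs]
        centre = median(days)
        mad = median([abs(d - centre) for d in days])
        if mad == 0:
            continue  # no spread in this cluster, nothing to flag
        for i in idxs:
            robust_z = 0.6745 * abs(rows[i][2] - centre) / mad
            flags[i] = robust_z > threshold
    return flags


def trend_days_per_decade(rows):
    """Ordinary least-squares slope of bloom day against year, times 10."""
    years = [r[1] for r in rows]
    days = [r[2] for r in rows]
    mean_year = sum(years) / len(rows)
    mean_day = sum(days) / len(rows)
    sxx = sum((y - mean_year) ** 2 for y in years)
    sxy = sum((y - mean_year) * (d - mean_day) for y, d in zip(years, days))
    return 10 * sxy / sxx


flags = flag_inconsistent(records)
kept = [row for row, flagged in zip(records, flags) if not flagged]

print("flagged:", [row for row, flagged in zip(records, flags) if flagged])
print("trend, all observations :", round(trend_days_per_decade(records), 2))
print("trend, consistent subset:", round(trend_days_per_decade(kept), 2))
```

Comparing each observation only with others in the same cluster is what lets context explain apparent outliers: a bloom date that looks late against the whole dataset can be ordinary for a cluster of cold sites. Comparing the two trends shows how strongly a few flagged observations can shift an estimated rate of change.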
Related resources
Exploring climate change and its impact on agriculture using volunteered geographic information
The PhD research exposed in this paper aims to develop workflows for fine scale study of climate change and its impact on agriculture using volunteered geographic information in phenology. First, a consistency checking workflow was developed to ensure the quality of volunteered observations. Next, by using novel predictors, spatio-temporal variation in plant phenology is modeled so that we can ...
Assessment of the completeness of Volunteered Geographic Information focusing on building blocks data (Case Study: Tehran metropolis)
Open Street Map (OSM) is currently the largest collection of volunteered geographic data, widely used in many projects as an alternative to or integrated with authoritative data. However, the quality of these data has been one of the obstacles to their wide use. In this article, from among the elements related to the quality of volunteered geographic data, we have tried to examine the com...
Validation of Volunteered Geographic Information Landuse Change Using Satellite Imagery
Land use change monitoring is one of the main concerns of managers and urban planners due to human activities and unbalanced physical development in urban areas. In this paper, a combination of remote sensing data and volunteered geographic information was used to assess the quality of volunteered geographic information on land use and land cover changes monitoring. For this purpose, the ORBVIE...
Monitoring and Assessing Post-Disaster Tourism Recovery Using Geotagged Social Media Data
Tourism is one of the most economically important industries. It is, however, vulnerable to disaster events. Geotagged social media data, as one of the forms of volunteered geographic information (VGI), has been widely explored to support the prevention, preparation, and response phases of disaster management, while little effort has been put on the recovery phase. This study develops a scienti...
Constructing gazetteers from volunteered Big Geo-Data based on Hadoop
Traditional gazetteers are built and maintained by authoritative mapping agencies. In the age of Big Data, it is possible to construct gazetteers in a data-driven approach by mining rich volunteered geographic information (VGI) from the Web. In this research, we build a scalable distributed platform and a high-performance geoprocessing workflow based on the Hadoop ecosystem to harvest crowd-sou...
Journal title:
Volume 10, issue
Pages -
Publication date: 2015